Leveraging Multilingual Training for Limited Resource Event Extraction

Authors

  • Andrew Hsi
  • Yiming Yang
  • Jaime G. Carbonell
  • Ruochen Xu
Abstract

Event extraction has become one of the most important topics in information extraction, but to date, there is very limited work on leveraging cross-lingual training to boost performance. We propose a new event extraction approach that trains on multiple languages using a combination of both language-dependent and language-independent features, with particular focus on the case where target domain training data is of very limited size. We show empirically that multilingual training can boost performance for the tasks of event trigger extraction and event argument extraction on the Chinese ACE 2005 dataset.
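
The abstract does not spell out how the two feature types are combined, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes language-independent features (here a universal part-of-speech tag and an entity type) share one name across languages, while language-dependent features (here the surface word) are prefixed with a language code. The function `extract_features`, the feature names, and the example tokens are all hypothetical.

```python
from typing import Dict


def extract_features(token: str, pos: str, entity_type: str, lang: str) -> Dict[str, float]:
    """Build a sparse feature dict for one candidate trigger token (hypothetical scheme)."""
    return {
        # Language-independent: the same feature name is used for every language,
        # so a weight learned from one language's data applies to the others.
        f"pos={pos}": 1.0,
        f"entity={entity_type}": 1.0,
        # Language-dependent: the name is prefixed with the language code,
        # so its weight is learned only from that language's data.
        f"{lang}:word={token.lower()}": 1.0,
    }


# Illustrative tokens only; not drawn from the ACE 2005 data.
english = extract_features("attacked", "VERB", "NONE", "en")
chinese = extract_features("袭击", "VERB", "NONE", "zh")

# Features present in both examples are the ones a multilingual model can share.
shared = sorted(set(english) & set(chinese))
print(shared)
```

Under this assumed scheme, a classifier trained on the union of both languages' examples would reuse the shared feature weights when target-language data is scarce, while the prefixed lexical features remain specific to each language.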

Similar resources

Multilingual Data Selection for Low Resource Speech Recognition

Feature representations extracted from deep neural network-based multilingual frontends provide significant improvements to speech recognition systems in low resource settings. To effectively train these frontends, we introduce a data selection technique that discovers language groups from an available set of training languages. This data selection method reduces the required amount of training ...

Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting

Stacked-Bottle-Neck (SBN) feature extraction is a crucial part of modern automatic speech recognition (ASR) systems. The SBN network traditionally contains a hidden layer between the BN and output layers. Recently, we have observed that an SBN architecture without this hidden layer (i.e. direct BN-layer – output-layer connection) performs better for a single language but fails in scenarios wher...

Improving Deliverable Speech-to-Text Systems with Multilingual Knowledge Transfer

This paper reports our recent progress on using multilingual data for improving speech-to-text (STT) systems that can be easily delivered. We continued the work BBN conducted on the use of multilingual data for improving Babel evaluation systems, but focused on training time-delay neural network (TDNN) based chain models. As done for the Babel evaluations, we used multilingual data in two ways:...

Investigation of bottleneck features and multilingual deep neural networks for speaker verification

Recently, the integration of deep neural networks (DNNs) with i-vector systems has proved effective for speaker verification. This method uses the DNN with senone outputs to produce frame alignments for sufficient statistics extraction. However, two types of data mismatch may degrade the performance of the DNN-based speaker verification systems. First, the DNN requires transcribed training...

Towards Multilingual Event Extraction Evaluation: A Case Study for the Czech Language

This paper presents a multilingual corpus of news, annotated with event metadata information. The events in our corpus are from the domain of violence, natural and man-made disasters. The main goal of the corpus is automatic evaluation of event detection and extraction systems in different languages. As a use case, we take a rule-based event extraction system, extend it to cover a new language, ...

Publication date: 2016